Abstract

When an infectious disease strikes a population, the number of newly reported cases 
is often the only available information that one can obtain during early stages of the 
outbreak. An important goal of early outbreak analysis is to obtain a reliable estimate for
the basic reproduction number, $R_0$, from the limited information available. We present
a novel method that enables us to make a reliable real-time estimate of the reproduction 
number at a much earlier stage compared to other available methods. Our method takes 
into account the possibility that a disease has a wide distribution of infectious period and 
that the degree distribution of the contact network is heterogeneous. We validate our 
analytical framework with numerical simulations. 
1 Introduction



The basic reproduction number, $R_0$, is a fundamental characteristic of an infectious disease. It is generally defined as the expected number of new infections caused by a typical individual during the whole period of his/her infection, in a totally susceptible population (Anderson & May 1991; Anderson & Britton 2000; Dietz 1993). Because $R_0$ is a simple scalar quantity, and perhaps because in many circumstances it determines the expected final size of an outbreak (Kermack & McKendrick 1927; Dietz 1993; Ma & Earn 2006; Arino et al. 2007; Brauer 2008), it has been widely used as a gold standard to gauge the degree of threat that a specific infectious agent will pose as an outbreak progresses (Hethcote 2000; Lipsitch et al. 2003; Fraser et al. 2009). While it is clear that knowing the value of $R_0$ can be very useful to policymakers who are responding to an infectious disease invasion, it is not straightforward to obtain a reliable estimate of $R_0$, especially in early stages of an outbreak before large scale uncontrolled transmission has taken place and before the basic biology and transmission pathways of the pathogen have been characterized.

Early in an outbreak, the pattern of disease spread is predominantly under the influence of the probabilistic nature of infection transmission. Consequently, a wide array of ultimate outcomes is possible, ranging from the outbreak fizzling out even in the absence of intervention to circumstances where the initial stage expands into a large-scale epidemic. Once a full-blown epidemic develops, several assumptions can be made that simplify the estimation of $R_0$, as has been discussed in detail in the literature (Anderson & May 1991; Dietz 1993; Hethcote 2000; Ma & Earn 2006). In many circumstances, however, it is crucial to assess the impact of various intervention strategies before a large-scale epidemic occurs. In doing so, stochastic manifestations of disease transmission as well as the underlying structure of the contact network should be taken into account. The first aspect has been widely studied in the past. For example, the Reed-Frost model is a particular case of a chain-binomial stochastic model, where each infected individual can infect susceptible individuals independently, and those individuals are assumed to have the same contact rate with one another (Abbey 1952; Ball 1986; Addy et al. 1991). Another example is the methodology developed by Becker (Becker 1976) and Ball and Donnelly (Ball & Donnelly 1995) based on a branching process susceptible-infected-recovered (SIR) model. Branching processes have received wide attention because they facilitate the evaluation of the reproduction number as well as the final epidemic size and epidemic probability (Guttorp 1991). More recently, the 2003 global outbreak of severe acute respiratory syndrome (SARS) inspired the development of new methodologies based on knowledge of the daily number of new cases and the distribution of the serial interval between successive infections (Wallinga & Teunis 2004; Cauchemez et al. 2006a). Unfortunately, none of these methods takes into consideration the underlying contact network behind the epidemic process.

In this paper, we propose a new methodology to estimate the basic reproduction number together with other useful epidemiological quantities at an early stage of an outbreak, while accounting for both the stochastic nature of the transmission process and the underlying structure of the contact network. The main parameters used in this approach, i.e., the quantities we require in order to apply the method, are the infection rate probability, the removal rate probability and the number of newly-infected individuals per unit of time (case-notification data). The infection rate quantifies crucial information about the biology of the disease. For instance, for a viral infection, the infection rate, in principle, may be related to the viral load of infectious individuals. The removal rate indicates how quickly the infectious individuals are removed from disease dynamics. This rate can be related to death or various means of intervention such as quarantine, reduction of social activity due to severity of illness, behavior change, or other factors. The product of the infection rate probability and the cumulative removal rate probability is related to the generation interval distribution (Wallinga & Lipsitch 2007) (see Section 6). The three main parameters will be discussed in more detail throughout the paper. We will derive a stochastic renewal equation that allows us to understand how the number of newly-infected individuals is related to the preceding number of infected cases. Unlike existing approaches (e.g., Wallinga and Teunis (Wallinga & Teunis 2004)), our method incorporates the degree distribution of the social contact network. We will show how knowledge of the social contact network can be utilized to obtain a more reliable estimate of the reproduction number.

This paper is organized as follows. In Section 2 we introduce the ideas of contact network epidemiology and define the concepts of infectious rate, recovery rate and transmissibility. In Section 3 we show some examples of the spread of an infectious agent on a contact network. In Section 4 we outline a general framework to estimate the reproductive number assuming that all information about a specific realization of the epidemic process up to time $t$ is known. In Section 5 we derive several expressions that allow us to estimate the basic reproduction number, $R_0$, at an early stage of an outbreak and show some examples of the methodology. Finally, in Section 6 we show the relationship between our results and the methodology developed by Wallinga and Lipsitch (Wallinga & Lipsitch 2007), which is a special case of the framework presented in this paper.

2 Network basis 



In this section we briefly introduce the ideas of contact network epidemiology and define the concepts of infectious rate, recovery rate and transmissibility. We map a collection of $N$ individuals to a network in which each vertex represents an individual and each edge shows a pathway of possible infection transmission between two individuals. The degree of each individual, i.e., the number of contacts that s/he has - or the number of edges emanating from a vertex - is denoted by $k$, and $p_k$ denotes the degree distribution, i.e., the probability that a randomly chosen vertex has degree $k$ ($k$ contacts). Several important quantities can be derived once a network's degree distribution is known. All moments of the degree distribution can be obtained since $\langle k^n \rangle = \sum_{k=0}^{\infty} k^n p_k$. For $n = 1$, $\langle k \rangle$ is the average number of nearest neighbors of a randomly chosen individual, which we denote $z_1$. The average number of second degree neighbors of a randomly selected individual, $z_2$, can be expressed as $\langle k^2 \rangle - \langle k \rangle$ (Newman 2002). To estimate $R_0$, we count the number of edges along which an individual can infect others, once that individual has become infected. This quantity, the excess degree, is the number of edges emanating from a vertex (individual), excluding the edge that was the source of the focal individual's infection. In fact, the average excess degree is simply the ratio $z_2/z_1$ (Noel et al. 2009).
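
The moment formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function names `degree_moments` and `exponential_pk` and the truncation point `kmax` are illustrative assumptions. It evaluates $z_1$, $z_2$ and the average excess degree $z_2/z_1$ for the exponential degree distribution $p_k = (1 - e^{-1/\kappa}) e^{-k/\kappa}$ with $\kappa = 4$ that is used in the examples of Section 3.

```python
import math

def degree_moments(pk):
    """Return (z1, z2, z2/z1) for a degree distribution given as a list pk[k]."""
    z1 = sum(k * p for k, p in enumerate(pk))        # <k>
    k2 = sum(k * k * p for k, p in enumerate(pk))    # <k^2>
    z2 = k2 - z1                                     # <k^2> - <k>
    return z1, z2, z2 / z1

def exponential_pk(kappa, kmax=2000):
    """Exponential degree distribution, truncated at kmax (assumed large enough)."""
    q = math.exp(-1.0 / kappa)
    return [(1.0 - q) * q**k for k in range(kmax + 1)]

z1, z2, excess = degree_moments(exponential_pk(4.0))
print(f"z1 = {z1:.4f}, z2/z1 = {excess:.4f}")
```

For $\kappa = 4$ the ratio evaluates to approximately 7.0416, the value of $z_2/z_1$ quoted for the exponential network in Section 3.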

We denote the time at which an individual acquires the infection by $t_i$, the time since acquiring infection by $\tau = t - t_i$ (also known as age of infection) and the total time that the individual can cause infection (time-to-removal) by $t_R$. While harboring the infection, the person is first exposed (i.e., infected but not infectious) and then infectious (either symptomatically or asymptomatically). The individual may also recover, by which we mean only that s/he can no longer transmit the infection, not that s/he has necessarily completely cleared the pathogen; for some diseases, after a temporary recovery, the person might become infectious again. Knowing that an individual acquires infection at a given time $t_i$, various states of infectiousness of this individual can be encapsulated within the hazard of infection or infectivity function, $\lambda_i(\tau)$. The infectivity function measures the instantaneous risk of disease transmission across an edge. This means that the conditional probability that infection occurs across an edge between times $\tau$ and $\tau + d\tau$, given that it did not occur by time $\tau$, is equal to $\lambda_i(\tau)\,d\tau$. Typically $\lambda_i(\tau)$ is zero initially (i.e., during the latent period), increases to a certain level and then declines (during the infectious period) before finally vanishing and remaining zero (i.e., in the case of permanent recovery or death). Figure 1 shows four hypothetical infectivity functions, the first of which is the typical case. In practice, the functional form of $\lambda_i(\tau)$ should be estimated from the actual transmission profile corresponding to a specific disease. Note that because of the limited restrictions on the functional form of $\lambda_i(\tau)$ (it must only be a non-negative integrable function), the methodology we present here applies to any staged progression transmission process, such as an $SE_1E_2 \ldots E_mI_1I_2 \ldots I_nR$ process, where $S$, $E_i$ and $I_j$ ($i = 1, \ldots, m$ and $j = 1, \ldots, n$) and $R$ are the susceptible, exposed (infected but not infectious), infectious and recovered classes, respectively.

Using the infectivity function, we can evaluate the probability that transmission occurs across an edge during a specific time period. In particular, given the removal time of an individual, the transmissibility $T(\tau, t_R)$ - the probability that the individual transmitted the disease to one of its neighbors before $\tau$ units of time (since acquiring the infection) - satisfies (Cox & Oakes 1984; Newman 2002):

$$ T(\tau, t_R) = 1 - \exp\left(-\int_0^{\tau} \lambda_i(u)\,du\right), \qquad \tau < t_R, $$
$$ T(\tau, t_R) = T(\infty, t_R) = 1 - \exp\left(-\int_0^{t_R} \lambda_i(u)\,du\right), \qquad \text{otherwise}. \qquad (1) $$

Let $\phi(\tau, t_R)$ denote the logarithmic derivative of $T(\tau, t_R)$, which we will refer to as the infectivity rate, i.e., let

$$ \phi(\tau, t_R) = \frac{1}{T(\tau, t_R)} \frac{dT(\tau, t_R)}{d\tau}. \qquad (2) $$

Then it follows that $T(\tau, t_R)$ can also be expressed as:

$$ T(\tau, t_R) = \exp\left(\int \phi(u, t_R)\,du\right). \qquad (3) $$







In general, the time-to-removal period, $t_R$, varies from one individual to another and there is no a priori knowledge of the exact value of this quantity for each individual. Therefore we must account for its variability as well. Let $\lambda_r(t_R)$ denote the removal function or recovery hazard, i.e., the instantaneous rate of recovery for an individual $t_R$ units of time after acquiring the infection. This means that the conditional probability that an individual is removed between times $t_R$ and $t_R + dt_R$, given that it was not removed by time $t_R$, is equal to $\lambda_r(t_R)\,dt_R$. Let $\Psi(t_R)$ denote the probability that an individual has time-to-removal period $> t_R$, then (Cox & Oakes 1984)

$$ \Psi(t_R) = \exp\left(-\int_0^{t_R} \lambda_r(u)\,du\right), \qquad (4) $$

subject to the condition $\Psi(\infty) = 0$. We define the probability density function $\psi(t_R) = -d\Psi(t_R)/dt_R$ (or $\Psi(t_R) = \int_{t_R}^{\infty} \psi(t'_R)\,dt'_R$).

Using equation (1) and $\psi(t_R)$ one can calculate the expected value of the transmissibility across an edge $\tau$ units of time after infection as:

$$ T(\tau) = \int_0^{\infty} \psi(t_R)\, T(\tau, t_R)\,dt_R. \qquad (5) $$

Finally, we define the overall expected transmissibility, i.e., the transmissibility when we wait for an adequately long time ($\tau \to \infty$), as

$$ T = T(\infty) = \int_0^{\infty} \psi(t_R)\, T(\infty, t_R)\,dt_R. \qquad (6) $$
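
As an illustration of how the overall expected transmissibility can be evaluated, the following Python sketch integrates $\psi(t_R)\,T(\infty, t_R)$ numerically. It assumes constant hazards $\lambda_i$ and $\lambda_r$ in continuous time, so that $\psi(t_R) = \lambda_r e^{-\lambda_r t_R}$ and $T(\infty, t_R) = 1 - e^{-\lambda_i t_R}$; this is a simplifying assumption that need not match the settings of the simulations reported later. The function name, the cutoff `t_max` and the grid size `n` are hypothetical choices.

```python
import math

def expected_transmissibility(lam_i, lam_r, t_max=200.0, n=200_000):
    """Midpoint-rule estimate of the integral of psi(t_R) * T(inf, t_R) over t_R,
    assuming constant infection hazard lam_i and constant removal hazard lam_r."""
    dt = t_max / n
    total = 0.0
    for j in range(n):
        t_r = (j + 0.5) * dt
        psi = lam_r * math.exp(-lam_r * t_r)      # removal-time density
        t_inf = 1.0 - math.exp(-lam_i * t_r)      # T(inf, t_R) for a constant hazard
        total += psi * t_inf * dt
    return total

lam_i, lam_r = 0.12771, 0.25
T = expected_transmissibility(lam_i, lam_r)
closed_form = lam_i / (lam_i + lam_r)             # analytic value of the same integral
print(T, closed_form)
```

Under these assumptions the integral has the closed form $\lambda_i/(\lambda_i + \lambda_r)$, which the numerical estimate reproduces; for time-dependent hazards only the two lines defining `psi` and `t_inf` would need to change.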
Figure 1: Four hypothetical profiles illustrating various infectivity functions, $\lambda_i(\tau)$.

The reproduction number can now be expressed as $R_0 = \frac{z_2}{z_1} T$, and represents the average number of infections caused by an infected individual before it is removed.

3 Disease dynamics on networks 

In this section we present some examples of the spread of an infectious agent on a contact network. The pattern of disease spread on a network can be categorized by three different regimes: stochastic, exponential, and decline. The process of disease spread is indeed stochastic in nature, given that the disease transmission along an edge occurs in a probabilistic manner, and that the degree of each newly infected individual cannot be determined a priori. The stochastic behavior is dominant in the initial state of disease spread when the number of infectious individuals is comparatively small (stochastic regime). The effect of stochasticity becomes much less relevant when the number of newly infected individuals dramatically increases, because the stochastic fluctuations (noise) are then overwhelmed by a smooth "signal" (exponential regime). In this regime, the number of new infections occurring at time $t$, $J(t)$, grows exponentially and therefore can be expressed as:

$$ J(t) \sim \exp(\alpha t). \qquad (7) $$

Progression of disease spread starts to decline as the total number of infected cases becomes comparable to the size of the network. This is when the finite-size effects become important (Noel et al. 2009) (declining regime). All three regimes are represented in figure 2.

Figure 2: Two hypothetical realizations of an epidemic process on a network. The left, middle and right hand sides represent stochastic, exponential and decline regimes, respectively.

Figures 3 and 4 show two examples of the epidemic behavior in the three regimes. The algorithm used to simulate the spread of an infectious agent on a contact network is described in the appendix. The left panel of figure 3 shows two epidemic events unfolding on a binomial network, $p_k = \binom{N-1}{k} p^k (1-p)^{N-1-k}$, with $z_1 = 5$ (or $p = z_1/(N-1)$), $N = 100{,}000$ nodes and $\{\lambda_i = 0.12771, \lambda_r = 0.25\}$. We use the tilde notation to make the distinction between the realization of a stochastic process (with tilde) and its expected (average) value (without tilde). Using equation (6), the ultimate transmissibility can be calculated as $T = 0.32727$. The stochastic, exponential and decline regimes are approximately between $0 < t < 20$, $20 < t < 40$ and $t > 40$ time units, respectively. Also, in the right panel of figure 3 we show the variation of $\ln(\tilde{J}(t))$ over time. The two solid lines $y_1, y_2 \sim \alpha t$ with $\alpha_1 = \alpha_2 = 0.26$ are the tangents to the curves in the exponential regime. Figure 4 shows the same results for an exponential network with $p_k = (1 - e^{-1/\kappa})\, e^{-k/\kappa}$ and $\kappa = 4$ ($z_2/z_1 = 7.0416$). In this example, the stochastic, exponential and declining regimes are approximately between $0 < t < 10$, $10 < t < 20$ and $20 < t$ time steps, respectively. The lines $y_1, y_2 \sim \alpha t$ with $\alpha_1 = \alpha_2 = 0.516$ are the tangents to the curves in the exponential regime. Here, we used epidemic simulation outputs to generate "synthetic" time series data. In practice, $\alpha$ can be estimated from real-life time series data if the outbreak progresses beyond the stochastic regime. As can be seen in figures 3 and 4, the number of newly infected cases at time $t$, $\tilde{J}(t)$, generally resembles a noisy Gaussian-type function if the pattern of disease spread has a chance to grow substantially.

Figure 3: The number of newly-infected cases (left panel) and its logarithm (right panel) for a binomial network with $z_1 = 5$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$. Two independent epidemic realizations are shown in each panel (green and blue).
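
One simple way to estimate the growth rate $\alpha$ from a case time series is an ordinary least-squares fit of $\ln J(t)$ against $t$ over a window believed to lie in the exponential regime. The Python sketch below illustrates this; it is not the procedure used to draw the tangent lines in the figures, and the function name `fit_growth_rate`, the window bounds and the synthetic noise-free data are assumptions made for the example.

```python
import math

def fit_growth_rate(times, cases, t_start, t_end):
    """Least-squares slope of ln(cases) versus time on the window [t_start, t_end].
    Zero counts are skipped because their logarithm is undefined."""
    pts = [(t, math.log(c)) for t, c in zip(times, cases)
           if t_start <= t <= t_end and c > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return sxy / sxx

# Synthetic, noise-free series growing at rate 0.26 (illustrative only).
times = list(range(41))
cases = [2.0 * math.exp(0.26 * t) for t in times]
alpha = fit_growth_rate(times, cases, 20, 40)
print(alpha)
```

With real data the choice of window matters: including points from the stochastic or declining regime would bias the fitted slope.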

4 Stochastic dynamics of disease 

In this section we outline a general framework to estimate the reproductive number assuming that all information about a specific realization of the epidemic process up to time $t$ is known.

Figure 4: The number of newly-infected cases (left panel) and its logarithm (right panel) for an exponential network with $\kappa = 4$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$. Two independent epidemic realizations are shown in each panel (green and blue).

Let $\tilde{\Psi}(t-\tau, t)\,\tilde{J}(t-\tau)$ denote the fraction of individuals infected at time $t - \tau$ that remain infectious at time $t$. As mentioned before, the tilde is used to denote that a quantity corresponds to a specific realization of the epidemic process. Let $\tilde{Z}(t-\tau, t)$ denote the average (per individual) excess degree at time $t$. The aggregated excess degree (total number of links) of the individuals infected at time $t - \tau$ that are still infectious at time $t$ is then given by $\tilde{J}(t-\tau)\,\tilde{\Psi}(t-\tau, t)\,\tilde{Z}(t-\tau, t)$. Let $\tilde{T}(t, \tau, t_R > \tau)\,\phi(\tau, t_R > \tau)\,d\tau$ denote the fraction of these links that actually transmit the disease at time $t$ (infectious). Then the number of new infections occurring at time $t$ satisfies the renewal equation:

$$ \tilde{J}(t) = \int_0^t \tilde{J}(t-\tau)\,\tilde{\Psi}(t-\tau, t)\,\tilde{Z}(t-\tau, t)\,\phi(\tau, t_R > \tau)\,\tilde{T}(t, \tau, t_R > \tau)\,d\tau. \qquad (8) $$

$\tilde{J}(t)$ is the only dependent variable in the above formula.

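As an outbreak progresses, at any given time there is a population of infected individuals, $\tilde{\mathcal{I}}(t)$, and a population of removed individuals, $\tilde{\mathcal{R}}(t)$. The number of affected individuals - the total number of infected individuals at time $t$ and those who recovered/removed by time $t$ - is given by

$$ \tilde{\mathcal{I}}(t) + \tilde{\mathcal{R}}(t) = \int_0^t \tilde{J}(t-\tau)\,d\tau, \qquad (9) $$

where the total number of infected cases is given by

$$ \tilde{\mathcal{I}}(t) = \int_0^t \tilde{J}(t-\tau)\,\tilde{\Psi}(t-\tau, t)\,d\tau, \qquad (10) $$

which, in turn, implies that

$$ \tilde{\mathcal{R}}(t) = \int_0^t \tilde{J}(t-\tau)\left[1 - \tilde{\Psi}(t-\tau, t)\right] d\tau. \qquad (11) $$

The total excess degree of removed individuals is given by

$$ \tilde{Z}^r(t) = \int_0^t \tilde{J}(t-\tau)\,\tilde{Z}(t-\tau, t)\left[1 - \tilde{\Psi}(t-\tau, t)\right] d\tau. \qquad (12) $$

Figure 5: This figure illustrates the dependency of $\tilde{J}(t)$ on its past values. Only a fraction of the cases infected at former time $t - \tau$, $\tilde{J}(t-\tau)$, contributes to the infection at the current time $t$, $\tilde{J}^i(t, \tau) = \tilde{J}(t-\tau)\,\tilde{\Psi}(t-\tau, t)$, while the rest, $\tilde{J}^r(t, \tau) = \tilde{J}(t-\tau)[1 - \tilde{\Psi}(t-\tau, t)]$, have already been removed.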



It is worth emphasizing that $\tilde{Z}^r(t)$ is the total number of edges that could have transmitted the infection by already removed individuals by time $t$, while only a fraction of them actually transmitted the disease successfully. This latter fraction is given by $\tilde{\mathcal{I}}^r(t) + \tilde{\mathcal{R}}^r(t)$, where $\tilde{\mathcal{I}}^r$ and $\tilde{\mathcal{R}}^r$ are the number of infectious and removed individuals at time $t$, infected by individuals who are already removed at time $t$. In the same manner, one can define $\tilde{\mathcal{I}}^i(t)$ and $\tilde{\mathcal{R}}^i(t)$ as the number of infectious and removed individuals who acquired infection from those who remain themselves infectious at time $t$. Fig. 6 illustrates how individuals in various states infect each other, where the arrows show "who is infected from whom". More precisely, the arrows establish the causality link by showing where the superscript index (the predecessor) may come from. For instance, a recovered/removed predecessor, denoted by superscript $r$ in $\tilde{\mathcal{R}}^r(t)$, can only come from a group whose members were recovered while their predecessors were still infectious ($\tilde{\mathcal{R}}^i$), or from a group in which the person and his/her predecessor are removed (denoted by the self-loop from $\tilde{\mathcal{R}}^r$ to itself). Similarly, the recovered predecessor ($r$) of an infected person ($\tilde{\mathcal{I}}^r(t)$) may only come from a group of recovered individuals whose respective predecessors were infectious ($\tilde{\mathcal{R}}^i(t)$) or from a group whose respective predecessors were already recovered ($\tilde{\mathcal{R}}^r(t)$). The expected transmissibility of the removed individuals $\tilde{T}^r(t)$ at time $t$ can thus be defined as

$$ \tilde{T}^r(t) = \frac{\tilde{\mathcal{I}}^r(t) + \tilde{\mathcal{R}}^r(t)}{\tilde{Z}^r(t)}. \qquad (13) $$

It is not possible to obtain a closed form equation for $\tilde{\mathcal{I}}^r$ and $\tilde{\mathcal{R}}^r$ in terms of the above-mentioned quantities, because the information about who gets infected from whom is not included in equation (8). However, estimates for the expected values of these quantities can be calculated and are given in the next section. This will allow us to estimate the basic reproduction number $R_0$.

Figure 6: This figure illustrates various options of causality between infected and recovered cases. The letter (I or R) denotes the current status of an individual, and the superscript (i or r) corresponds to the status of the person who passed on the infection to this individual. $\tilde{\mathcal{I}}^r(t) + \tilde{\mathcal{R}}^r(t)$ constitutes the number of infections caused by the already removed individuals.

5 Quasi-stochastic dynamics of disease spread 

In this section we derive several expressions that allow us to estimate the basic reproduction number, $R_0$, at an early stage of an outbreak and present some examples of the methodology. Before going into the details, we introduce the following identity equation, which is helpful in establishing a link between the concepts in the previous and the current section:

(14)

Although the previous formalism is very general, in almost all practical situations it is very difficult to obtain $\tilde{Z}(t-\tau, t)$, unless one is able to perform a very precise contact-tracing investigation. As a result, we prefer to approximate the excess degree per individual by the "expected excess degree", i.e., $\tilde{Z}(t-\tau, t) \approx z_2/z_1$. It should be noted that this approximation is only valid when the following three conditions are met: a) the effect of degree correlation between neighbors is negligible; b) the number of infected individuals is large enough to provide adequate sample size; and c) the number of infected individuals is small enough compared to the network size. Since our objective is to calculate $R_0$ at an early stage, these assumptions remain plausible. It should also be noted that the expected excess degree at $t = 0$
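is $Z(0) = z_1$ and rapidly tends to $z_2/z_1$ as time progresses. This leads to a more precise equation $\tilde{Z}(t-\tau, t) \approx \frac{1}{\tilde{\mathcal{R}}(t)}\left[(\tilde{\mathcal{R}}(t) - 1)\frac{z_2}{z_1} + z_1\right]$, which might be more suitable when $\tilde{\mathcal{R}}(t)$ is rather small.

As before, we also assume that the time to removal has the probability density $\psi(t_R)$.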
Replacing $\tilde{Z}(t-\tau, t)$ by $z_2/z_1$ in (8), we obtain a new expression for the number of new infections at time $t$:

$$ \tilde{J}(t) = \int_0^t \tilde{J}(t-\tau)\,\Psi(\tau)\,\tilde{R}_0(t, \tau)\,\phi(\tau, t_R > \tau)\,d\tau, \qquad (15) $$

where $\tilde{R}_0(t, \tau) = \tilde{T}(t, \tau, t_R > \tau)\,\frac{z_2}{z_1}$ is the only independent random variable.

The expected numbers of infectious and removed individuals are given by

$$ \mathcal{I}(t) = \int_0^{\infty} \psi(\tau) \int_0^{\min(\tau, t)} \tilde{J}(t-\tau')\,d\tau'\,d\tau \qquad (16) $$

and

$$ \mathcal{R}(t) = \int_0^{t} \psi(\tau) \int_{\tau}^{t} \tilde{J}(t-\tau')\,d\tau'\,d\tau. \qquad (17) $$

Figures 7 and 8 show the number of infectious (left panel) and removed (right panel) individuals for a disease that spreads on binomial and exponential networks, respectively. Each panel shows three curves. The curve consisting of red circles corresponds to the computer simulation of an epidemic on the network; during the simulation the new case counts are recorded to create a synthetic "time series" for $\tilde{J}(t)$. The blue curve corresponds to the closed form equations (16) or (17) (for $\mathcal{I}(t)$ or $\mathcal{R}(t)$), and the green one corresponds to the scenarios when the identity in equation (14) is used. These figures show a perfect agreement between these three curves.
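
The expected numbers of infectious and removed individuals can be computed directly from a case time series once a survival function $\Psi(\tau)$ is specified. The Python sketch below uses a simple discrete-time sum, $\mathcal{I}(t) \approx \sum_{\tau} \tilde{J}(t-\tau)\Psi(\tau)$ and $\mathcal{R}(t) \approx \sum_{\tau} \tilde{J}(t-\tau)[1 - \Psi(\tau)]$, as an approximation; the function name, the unit time step, the exponential survival function with $\lambda_r = 0.25$ and the case counts are all illustrative assumptions rather than data from the simulations.

```python
import math

def infectious_and_removed(new_cases, survival, dt=1.0):
    """Discrete approximation of the expected infectious and removed counts.
    new_cases[t] is the number of new infections at time step t and
    survival(tau) is the probability of still being infectious tau after infection."""
    infectious, removed = [], []
    for t in range(len(new_cases)):
        i_t = 0.0
        r_t = 0.0
        for tau in range(t + 1):
            s = survival(tau * dt)
            i_t += new_cases[t - tau] * s            # still infectious at time t
            r_t += new_cases[t - tau] * (1.0 - s)    # already removed by time t
        infectious.append(i_t)
        removed.append(r_t)
    return infectious, removed

lam_r = 0.25                                 # assumed constant removal hazard
cases = [1, 1, 2, 3, 5, 8, 13, 21]           # illustrative case counts
infectious, removed = infectious_and_removed(cases, lambda tau: math.exp(-lam_r * tau))
for t, (i_t, r_t) in enumerate(zip(infectious, removed)):
    print(t, round(i_t, 3), round(r_t, 3))
```

By construction the two series add up to the cumulative case count at every time step, which is the discrete counterpart of equation (9).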
Figure 7: Number of infectious and removed individuals as a function of time for the binomial network ($z_1 = 5$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$).

Figure 8: Number of infectious and removed individuals as a function of time for the exponential network ($\kappa = 4$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$).
It should be noted that in the quasi-stochastic analysis, since $\tilde{Z}(t-\tau, t)$ is approximated by $z_2/z_1$, equation (12) takes the following simple form

$$ Z^r(t) = \frac{z_2}{z_1}\,\mathcal{R}(t). \qquad (18) $$

The expected transmissibility of removed individuals $T^r(t)$ can be obtained as an extension of the overall expected transmissibility of Section 2, by replacing the removal rate distribution $\psi(t_R)$ with the conditional distribution of removal given that it occurred before time $t$, defined as $\psi^r(t, t_R)$. The quantity $\psi^r(t, t_R)\,dt_R$ is proportional to the number of individuals already removed by time $t$ that were removed after exactly $t_R$ units of time, i.e., $\psi^r(t, t_R)\,dt_R \propto \psi(t_R) \int_0^{t - t_R} \tilde{J}(t')\,dt'\,dt_R$. This probability function, after incorporating the proper normalization, can be written as

$$ \psi^r(t, t_R)\,dt_R = \frac{\psi(t_R) \int_0^{t - t_R} \tilde{J}(t')\,dt'}{\int_0^{t} \psi(t'_R) \int_0^{t - t'_R} \tilde{J}(t')\,dt'\,dt'_R}\,dt_R. \qquad (19) $$

The expected transmissibility of removed individuals can then be calculated from

$$ T^r(t) = \int_0^{t} \psi^r(t, t_R)\,T(\infty, t_R)\,dt_R. \qquad (20) $$

We now define the reproduction number of the removed individuals as

$$ R_0^r(t) = T^r(t)\,\frac{z_2}{z_1}. \qquad (21) $$

Combining the last equation with equations (13) and (18) one obtains that

$$ T^r(t)\,\frac{z_2}{z_1} = \frac{\mathcal{I}^r(t) + \mathcal{R}^r(t)}{\mathcal{R}(t)}. \qquad (22) $$

Now, considering the symmetry between different compartments in figure 6, it is straightforward to derive the following equation:

(23) 

_ X\t)+TV'{t) 

m 

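Using equation (22), the basic reproduction number $R_0 = \frac{z_2}{z_1} T$ can then be estimated as

$$ R_0(t) = \frac{\mathcal{I}^r(t) + \mathcal{R}^r(t)}{\mathcal{R}(t)} \cdot \frac{T}{T^r(t)}. \qquad (24) $$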

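On the right hand side of the last expression, the expected value of $\mathcal{I}^r(t) + \mathcal{R}^r(t)$ can be calculated as

$$ \mathcal{I}^r(t) + \mathcal{R}^r(t) = \int_0^{t} \int_0^{t'} \tilde{J}(t')\,\eta(t, t', \tau)\,\zeta(t', \tau)\,d\tau\,dt', \qquad (25) $$

where $\eta(t, t', \tau)$ is the fraction of infected individuals who are removed by time $t$ and might have infected others at time $t' < t$; $\zeta(t', \tau)$ is the probability function that an infection at time $t'$ was caused by any of these individuals. Let $J^i(t, \tau) = \tilde{J}(t-\tau)\,\Psi(\tau)$, then $\eta(t, t', \tau)$ can be written as

$$ \eta(t, t', \tau) = \frac{J^i(t', \tau) - J^i(t, \tau + t - t')}{J^i(t', \tau)} = \frac{\Psi(\tau) - \Psi(\tau + t - t')}{\Psi(\tau)}. \qquad (26) $$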
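Fig. 9 illustrates the concept. It shows how the new infection rate at time $t'$ is due to the infectious rate $J^i(t', \tau)$, a fraction of whom, $J^i(t, \tau + t - t')$, stays infectious at a later time $t$. It is straightforward to write $\zeta(t', \tau)$ as

$$ \zeta(t', \tau) = \frac{J^i(t', \tau)\,\frac{dT(\tau, t_R > \tau)}{d\tau}}{\int_0^{t'} J^i(t', \tau'')\,\frac{dT(\tau'', t_R > \tau'')}{d\tau''}\,d\tau''}. \qquad (27) $$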
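Substituting the expressions of $\eta$ and $\zeta$ in Eq. (25) we obtain

$$ \mathcal{I}^r(t) + \mathcal{R}^r(t) = \int_0^{t} \tilde{J}(t')\, \frac{\int_0^{t'} \left[J^i(t', \tau) - J^i(t, \tau + t - t')\right] \frac{dT(\tau, t_R > \tau)}{d\tau}\,d\tau}{\int_0^{t'} J^i(t', \tau)\,\frac{dT(\tau, t_R > \tau)}{d\tau}\,d\tau}\,dt', \qquad (28) $$

which only depends on case notification data and on the disease transmissibility. For a disease with constant removal rate $\lambda_r$, we have $J^i(t, \tau + t - t') = \tilde{J}(t' - \tau)\,\Psi(\tau)\,\Psi(t - t')$ and as a result $\mathcal{I}^r(t) + \mathcal{R}^r(t) = \mathcal{R}(t)$. Interestingly, in this case we also obtain $\mathcal{I}^i(t) + \mathcal{R}^i(t) = \mathcal{I}(t)$, together with a corresponding expression for $\mathcal{R}^i(t)$. This means that the first fraction on the right hand side in Eq. (24) equals unity, and therefore $T^r(t)\,\frac{z_2}{z_1} = 1$. Then the expression for $R_0(t)$ takes the simpler form

$$ R_0(t) = \frac{T}{T^r(t)}. \qquad (29) $$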
Figure 10: The expected reproduction number of removed individuals for the binomial (left panel; $z_1 = 5$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$) and exponential (right panel; $\kappa = 4$) networks in terms of the number of removed individuals.

5.1 Numerical examples 

To test the framework presented so far, we performed epidemic spread simulations on the two networks introduced earlier (binomial and exponential) and in each case collected the "time series" of case counts resulting from the simulations. In Fig. 10 we present the reproduction number of the removed individuals, $R_0^r(t)$, as a function of the number of removed individuals, $\mathcal{R}(t)$. The symbols show the results from direct counting during the simulation, whereas the lines show the results obtained from evaluating analytically each of the terms in Eq. (22) based on the estimated infection rate. The green and red colors correspond to the left and right hand side of Eq. (22), respectively. The discrepancy at the very early stage is mainly due to approximating $\tilde{Z}(t)$ with $z_2/z_1$. The small asymptotic deviation of our estimate for the exponential network comes from finite size effects (Noel et al. 2009).



Fig. 11 shows the expected reproduction number for the binomial (left panel) and the exponential (right panel) networks. The "real" values are $R_0 = 1.6864$ and $2.3749$, respectively. It is worth noting that $T^r(t)$ is a functional of $\lambda_i(t)$, $\lambda_r(t)$ and $\tilde{J}(t)$ (see equations (19) and (20)). Therefore, when $\lambda_r$ is constant, the condition $T^r(t)\,\frac{z_2}{z_1} = 1$ allows one to evaluate, from a given time series, one of the three quantities $\lambda_i$, $\lambda_r$ or $z_2/z_1$, if the other two quantities are assumed to be known (this statement holds even if $\lambda_r$ is a function of time, in which case the more elaborate equation (22) should be used). For instance, we simulated again the epidemic on the binomial and exponential networks presented earlier, but this time with constant values for $\lambda_r$ and $\lambda_i$. Using these values and the obtained $\tilde{J}(t)$, we calculated the value of $z_2/z_1$, which is presented in figure 12 for the binomial (left panel) and exponential (right panel) networks. The green line shows our estimates using the above condition, while the red line is calculated directly from the network structure.

Figure 11: The expected reproduction number for the binomial (left panel; $z_1 = 5$, $\lambda_i = 0.12771$ and $\lambda_r = 0.25$) and exponential (right panel; $\kappa = 4$) networks in terms of the number of removed individuals.

Figure 12: Estimates for the value of $z_2/z_1$ for the binomial (left) and exponential (right) networks, when $\lambda_i$ and $\lambda_r$ are known.
Figure 13: Estimates for the value of $\lambda_i$ for the binomial (left) and exponential (right) networks,

As a second example, we used the same epidemic simulations, but this time we used as input the values for $\lambda_r$ and $z_2/z_1$ that were derived directly from simulations and estimated the value of $\lambda_i$, which is shown in figure 13 for the binomial (left panel) and exponential (right panel) networks. Here again, one observes very good agreement between the value of $\lambda_i$ used in the simulations (black line) and its estimated value from the condition $T^r(t)\,\frac{z_2}{z_1} = 1$ (red circles).

6 Exponential regime and generation interval distribution

In this section we show the relationship between our results and the methodology developed by Wallinga and Lipsitch (Wallinga & Lipsitch 2007), which is a special case of the framework presented in this paper. In the exponential regime, one can ignore the stochastic fluctuation of the quantities and replace them with their expected values, since the stochastic effects become much less pronounced. In particular, we can re-write equation (15) by ignoring the stochastic effects as follows:
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J(t)= / Jit-T)^{T)Ro{T,tn>T)^{T,tR>T)dT 



(30) 



We define the expression inside the bracket as xi^')^ which is the expected infection rate 
per infected individual. Equation fl30|) then takes the following simple form 



Jit) 



At - r')x{T')dT\ 



(31) 



which is the well known Lotka renewal equation (ILotkal . 11939 



Kot 



2002) 



From the definition of $\chi(\tau)$ it is clear that $\int_0^\infty \chi(\tau)\,d\tau = R_0$. In the exponential regime,
after replacing $J(t) \sim \exp(\alpha t)$, we obtain

$$1 = \int_0^\infty \exp(-\alpha\tau')\,\chi(\tau')\,d\tau'. \tag{32}$$

It is worth mentioning that the exact exponential regime can be reached in the limit $t \to \infty$
for an infinite-size network and that is why the upper limit in the above integral is set to $\infty$.
This means that there should be a slight deviation from the exponential behavior for a "finite-size"
system at "finite time" once the outbreak has surpassed the stochastic regime. Following
the line of argument in Wallinga and Lipsitch (Wallinga & Lipsitch 2007) we find that



$$\frac{1}{R_0} = \int_0^\infty \exp(-\alpha\tau)\,\hat{\chi}(\tau)\,d\tau, \tag{33}$$

where $\hat{\chi}(\tau) = \chi(\tau)/R_0$ is the generation interval distribution. This equation relates the basic
reproduction number to the Laplace transform of the generation interval distribution in
the asymptotic case (infinite-size, infinite-time). It is straightforward to obtain the following
equation
This equation describes the relationship between the generation interval distribution and
the derivative of transmissibility with respect to the duration of infection. 
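
As a numerical illustration of equations (31) to (33), the following Python sketch iterates a discretized renewal equation, measures the resulting exponential growth rate $\alpha$, and then recovers $R_0$ from the Laplace transform of the generation interval distribution. The exponential kernel, the parameter values (`R0_TRUE`, `B`, the step size) and all function names are assumptions chosen for illustration and do not come from the simulations reported in this paper.

```python
import math


def simulate_renewal(chi, h, n_steps, seed=1.0):
    """Discretized renewal equation: J[n] = h * sum_{k=1..n} chi(k*h) * J[n-k]."""
    kernel = [chi(k * h) for k in range(n_steps + 1)]
    J = [seed]
    for n in range(1, n_steps + 1):
        J.append(h * sum(kernel[k] * J[n - k] for k in range(1, n + 1)))
    return J


def growth_rate(J, h, window):
    """Average exponential growth rate over the last `window` steps."""
    return math.log(J[-1] / J[-1 - window]) / (window * h)


def r0_from_growth_rate(alpha, gen_pdf, t_max, n):
    """R0 = 1 / integral_0^t_max exp(-alpha*t) * gen_pdf(t) dt (trapezoid rule)."""
    h = t_max / n

    def f(t):
        return math.exp(-alpha * t) * gen_pdf(t)

    integral = h * (0.5 * (f(0.0) + f(t_max)) + sum(f(k * h) for k in range(1, n)))
    return 1.0 / integral


# Hypothetical inputs: an exponential generation interval with rate B.
R0_TRUE, B = 1.5, 0.5


def gen_pdf(t):
    return B * math.exp(-B * t)


def chi(t):
    # Expected infection rate per infected individual; integrates to R0_TRUE.
    return R0_TRUE * gen_pdf(t)


J = simulate_renewal(chi, h=0.05, n_steps=1200)
alpha = growth_rate(J, h=0.05, window=200)
r0_est = r0_from_growth_rate(alpha, gen_pdf, t_max=100.0, n=100_000)
print(f"alpha = {alpha:.4f}, recovered R0 = {r0_est:.4f}")
```

For this kernel the continuous-time growth rate is $B(R_0 - 1) = 0.25$. With a time step of 0.05 the discretization biases $\alpha$ slightly below that value, so the recovered $R_0$ falls slightly below 1.5; reducing the step narrows the gap.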

Equation (32) has a simpler form for constant $\lambda_r$ and $\lambda_i$. This can be obtained by substituting
$\psi(t) = \lambda_r \exp(-\lambda_r t)$ and $T(t', t) = 1 - \exp(-\lambda_i t')$ into equation (32), which gives

$$\frac{z_2}{z_1}\,\frac{\lambda_i}{\lambda_i + \lambda_r + \alpha} = 1. \tag{36}$$

Solving (36) for the growth rate gives $\alpha = \lambda_i (z_2/z_1 - 1) - \lambda_r$.
This leads to $\alpha = 0.26084$ and $\alpha = 0.51626$ for the binomial and exponential networks, respectively,
which are in excellent agreement with results shown in figures 3 and 4.

7 Conclusions

Using concepts from network theory, we presented a method which provides a reliable estimate
of the basic reproduction number, $R_0$, at an early stage of an outbreak. We provided the details
of calculations and compared our results at each step against simulations. Case notification
data (time series) is the main input to this analytical framework. As an outbreak begins to
unfold, the pattern of spread depends substantially on the structure of the underlying contact
network. This dependency, in fact, manifests itself in the formation of the time series of newly
infected cases. The proposed methodology highlights the interplay between the heterogeneity
in contacts (network structure), the estimates of the basic reproductive number and infection
transmissibility. Depending on the circumstances, this methodology can be used to infer other
useful quantities as well. For infectious pathogens that cause repeated outbreaks, there is
enough empirical evidence to establish the distribution of the duration of infectiousness as well
as the recovery rate of individuals. In this case, in addition to the basic reproductive number,
the proposed methodology can shed light on the structure of the underlying contact network
by estimating the ratio $z_2/z_1$. This is an important piece of information, because in many
circumstances it is not possible to capture and build a detailed contact network among individuals
based on some network generative rules. The importance of this quantity becomes more
apparent when an emerging infectious disease strikes a population. In this circumstance, there is
much less information on the characteristics of the disease, such as the duration of infectiousness
and recovery rate, which in turn determine the transmissibility of the disease. Knowledge of the
transmissibility of a disease during an early stage of an epidemic can play a crucial role, as
effective and cost-effective public health intervention strategies hinge on the degree of
contagiousness of a disease. On the other hand, before the spread of disease becomes rampant, the
structure of the contact network within a population remains more or less stable. Therefore,
the value of $z_2/z_1$ estimated during epidemic lulls from the time series of common
infections can be used to estimate the transmissibility of an emerging infectious
disease at an early stage of its outbreak. We demonstrated this concept with
two examples. Our estimate for the reproduction number very quickly tends to the real value,
thus enabling epidemiologists and policymakers to identify the optimal control strategies in real
time.
A Appendix

To perform Monte Carlo simulations of epidemic propagation on a network, one first requires
explicit knowledge of that network structure. We have used the following method (Newman
2002; Noel et al. 2009) to produce a network with a given degree distribution: (i) Generate
a random degree sequence $k_j$ of length $N$ drawn from the degree distribution $p_k$. (ii) Make sure
that $\sum_j k_j$ is an even number, since a link is composed of two "stubs." (iii) For each $j$, produce
a node with $k_j$ stubs. (iv) Randomly choose a pair of unconnected stubs and connect them
together. Repeat until all unconnected stubs are exhausted. (v) Test for the presence of
self-loops and repeated links. Remove the faulty links by randomly choosing a pair of connected
stubs and rewiring them to the former stubs. Repeat until no self-loops or repeated links
are found.
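
Steps (i) to (v) can be sketched in Python as follows. This is a minimal illustration rather than the code used for the simulations: the degree distribution `pk`, the network size, and the helper names are hypothetical, and step (v) is implemented as one plausible reading of the rewiring rule, in which a faulty link and a randomly chosen link exchange end points whenever that creates no new self-loop or repeated link.

```python
import random
from collections import Counter


def sample_degrees(pk, n, rng):
    """Steps (i)-(ii): draw n degrees with P(degree = k) = pk[k]; redraw until the sum is even."""
    while True:
        degrees = rng.choices(range(len(pk)), weights=pk, k=n)
        if sum(degrees) % 2 == 0:
            return degrees


def edge_key(u, v):
    """Orientation-independent label for the link between u and v."""
    return (u, v) if u <= v else (v, u)


def configuration_model(degrees, rng, max_attempts=100_000):
    # Steps (iii)-(iv): give node j exactly degrees[j] stubs and pair stubs at random.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    edges = [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]
    counts = Counter(edge_key(u, v) for u, v in edges)

    def faulty(u, v):
        return u == v or counts[edge_key(u, v)] > 1

    # Step (v): rewire self-loops and repeated links; every swap preserves all degrees.
    todo = [i for i, (u, v) in enumerate(edges) if faulty(u, v)]
    attempts = 0
    while todo:
        i = todo[-1]
        u, v = edges[i]
        if not faulty(u, v):  # already repaired by an earlier swap
            todo.pop()
            continue
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not remove all self-loops and repeated links")
        j = rng.randrange(len(edges))
        x, y = edges[j]
        new1, new2 = edge_key(u, x), edge_key(v, y)
        if j == i or u == x or v == y or new1 == new2 or counts[new1] or counts[new2]:
            continue  # swap would create a faulty link; try another partner
        counts[edge_key(u, v)] -= 1
        counts[edge_key(x, y)] -= 1
        counts[new1] += 1
        counts[new2] += 1
        edges[i], edges[j] = (u, x), (v, y)
        todo.pop()
    return edges


rng = random.Random(1)
pk = [0.0, 0.3, 0.3, 0.2, 0.2]  # hypothetical degree distribution for k = 0..4
degrees = sample_degrees(pk, 200, rng)
edges = configuration_model(degrees, rng)
```

Redrawing the whole sequence until the sum is even is only one way to satisfy step (ii); adjusting a single degree would also work but slightly perturbs the distribution.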

To simulate the spread of disease on a contact network in continuous time we follow a
tau-leaping approach (Gillespie 2001; Gillespie & Petzold 2003; Higham 2008), which we
describe below. The processes of disease transmission along one edge and the removal of
infectious individuals are controlled by $\lambda_i(\tau)$ and $\lambda_r(\tau)$, respectively. We divide time into
intervals of length $\Delta t$ and ensure that $\lambda_i(\tau)\Delta t$ and $\lambda_r(\tau)\Delta t$ are small enough, such that the
expected epidemic curve does not vary much by reducing $\Delta t$ even further. At every $\Delta t$ step,
each infectious individual recovers with probability $\lambda_r(\tau_j)\Delta t$, where $\tau_j$ is the age of infection
of individual $j$. If an infectious individual does not recover, then it independently infects each
of its susceptible contacts with probability $\lambda_i(\tau_j)\Delta t$.
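
The update rule described above can be sketched as a fixed-step loop. The ring-lattice contact network, the constant rates, and the function names below are assumptions made for illustration; the rates are passed as functions of the age of infection so that non-constant $\lambda_i(\tau)$ and $\lambda_r(\tau)$ can be substituted.

```python
import random

SUSCEPTIBLE, INFECTIOUS, REMOVED = 0, 1, 2


def simulate_epidemic(adjacency, lam_i, lam_r, dt, t_max, seed_node, rng):
    """Fixed-step epidemic on a network.

    adjacency[j] lists the contacts of node j; lam_i(tau) and lam_r(tau) are the
    per-edge transmission rate and the removal rate at age of infection tau.
    Returns (new infections per step, final state of every node).
    """
    state = [SUSCEPTIBLE] * len(adjacency)
    state[seed_node] = INFECTIOUS
    infected_at = {seed_node: 0.0}  # time at which each infectious node was infected
    new_cases = []
    for step in range(int(round(t_max / dt))):
        t = step * dt
        newly_infected, recovered = set(), []
        for j, t0 in infected_at.items():
            tau = t - t0
            # Removal is tested first; a removed node does not transmit in this step.
            if rng.random() < lam_r(tau) * dt:
                recovered.append(j)
                continue
            p = lam_i(tau) * dt
            for nb in adjacency[j]:
                if state[nb] == SUSCEPTIBLE and nb not in newly_infected and rng.random() < p:
                    newly_infected.add(nb)
        # Apply all state changes together at the end of the step.
        for j in recovered:
            state[j] = REMOVED
            del infected_at[j]
        for nb in newly_infected:
            state[nb] = INFECTIOUS
            infected_at[nb] = t + dt
        new_cases.append(len(newly_infected))
        if not infected_at:
            break
    return new_cases, state


# Hypothetical example: ring lattice in which each node has four contacts.
N = 200
adjacency = [[(n - 2) % N, (n - 1) % N, (n + 1) % N, (n + 2) % N] for n in range(N)]
rng = random.Random(0)
new_cases, final_state = simulate_epidemic(
    adjacency,
    lam_i=lambda tau: 0.5,  # constant transmission rate (assumed value)
    lam_r=lambda tau: 0.2,  # constant removal rate (assumed value)
    dt=0.05,
    t_max=50.0,
    seed_node=0,
    rng=rng,
)
```

Here `new_cases` plays the role of the case time series $J(t)$. Rerunning with a smaller `dt` and comparing the averaged curves is one way to check that the step is small enough, as required above.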